REVEAL:  Reconstruction,  Enhancement,  Visualization,  and  Ergonomic
Assessment  for  Laparoscopy 

2005  Annual  Report 


1.  Introduction 

Information  cues  available  in  laparoscopy  and  other  forms  of  minimally  invasive  surgery  are  impoverished  relative 
to  cues  available  in  open  surgery.  Acquiring  surgical  skill  in  such  an  environment  is  extremely  challenging.  Even 
after  mastery,  continued  practice  can  lead  to  problems  for  the  surgeon  as  indicated  by  frequent  incidence  of  pain  and 
injury  associated  with  laparoscopy.  The  long-term  impact  on  the  surgeon  performing  these  procedures  is  largely 
unknown. 

The  goal  of  this  work  is  to  develop  and  test  new  technologies  that  will  break  down  the  barriers  that  block  more 
surgeons  from  attaining  and  continuing  to  practice  (without  injury  or  pain)  high  levels  of  skill  in  MIS.  This  project 
will  develop  new  technology  by  concentrating  on  three  major  research  thrusts: 

•  Smart  Image:  the  project  will  develop  and  evaluate  new  approaches  for  extracting,  fusing,  and  presenting 
information  cues  from  imagery  and  other  data  sources. 

•  Configurable  Display:  the  project  will  develop  new  approaches  for  presenting  existing  data  (video,  CT 
data)  and  extracted  cues  (3D  reconstruction,  haptic  cues,  etc.)  to  the  user  within  a  flexible,  configurable 
display  environment.

•  Ergonomic  Assessment:  the  project  will  use  existing  technology  and  build  new  techniques  as  needed  to 
acquire  crucial  ergonomic  data  relative  to  key  factors  of  patient  position,  technology  configuration,  and 
instrument  design. 


2.  Major  Accomplishments 

In  this  section  we  provide  a  functional  view  of  major  tasks  accomplished  during  the  2005  project  year.  These 
include  (1)  deployment  of  the  REVEAL  display  system  and  tool  suite  in  the  University  of  Maryland  Medical 
Center’s  Simulation  Center,  (2)  REVEAL  tool  suite  improvements,  (3)  stereo  video  display  technology  upgrades,  (4) 
stereo  probe  calibration,  (5)  performance  modeling  and  analysis,  and  (6)  enhanced  experimental  tools  for  cognitive 
ergonomics  experiments. 

Deployment  of  the  REVEAL  Display  System  at  UMMC  Simulation  Center 

A  combined  motion  capture/immersive  display  system  has  been  deployed  in  the  Simulation  Center  at  the  University 
of  Maryland  Medical  Center.  The  system  was  deployed  in  a  decommissioned  OR  in  the  South  hospital  that  is  being 
refit  for  use  as  a  simulation,  training  and  research  center.  The  deployed  system  consists  of  custom  fixtures,  a
self-calibrating  immersive  display  similar  to  that  deployed  in  the  University  of  Kentucky  (UK)  REVEAL  lab,  and  a  Vicon
motion  capture  system.
The  system  supports  the  ergonomic  assessment  portion  of  REVEAL  at  the  University  of  Maryland. 

Custom  Fixtures.  A  custom-built  truss  system  provides  a  flexible  mounting  system  for  locating  cameras,  projectors 
and  Vicon  sensors  at  optimal  positions  within  the  laboratory  environment.  Quick-release  clamps  allow  the
environment  to  be  reconfigured  rapidly  to  meet  the  demands  of  a  particular  experiment.

Self-Calibrating  Immersive  Display.  A  self-calibrating  immersive  display  with  a  curvilinear  rear-projection  screen 
for  use  in  monocular  and  stereoscopic  display  configurations  was  deployed.  The  basic  hardware/software 
environment  is  similar  to  that  deployed  in  the  University  of  Kentucky’s  REVEAL  laboratory.  Changes  have  been 
made  where  the  unique  constraints  of  deployment  in  a  simulated  OR  have  diverged  from  our  general  purpose 
laboratory  set-up.  In  particular,  the  system  uses  a  smaller  wrap-around  screen  designed  to  maximize  the  immersive 
display  experience  for  the  surgical  team  in  their  customary  positions  during  a  procedure. 


Fig.  1:  The  subject  on  the  left  is  being  tracked  by  the  Vicon  camera 
system.  The  system  detects  joint  positions  and  can  estimate  angles  in 
a  unified  3D  coordinate  frame. 

Equipment  and  Environment:  Vicon  motion  capture  system.  A  Vicon  motion  capture  system  is  on-line,  providing 
sub-millimeter  precision  for  the  location  of  objects  in  three-dimensional  space.  This  system  is  being  used  to  study 
instrument  and  joint  positions  during  laparoscopic  procedures  to  assess  the  physical  requirements  of  laparoscopy 
(Fig.  1).  These  data  will  be  used  to  characterize  the  actual  demands  of  current  practice  and  to  design
improved  surgical  gestures  that  reduce  the  physical  demands  of  the  tasks.

This  Vicon  system  consists  of  12  high-speed,  high-resolution,  infrared,  digital  cameras.  For  custom  installation  of 
these  cameras,  a  truss  system  was  set  up.  These  cameras  have  been  positioned,  aimed,  and  focused  to  optimize  the 
size  of  the  motion  capture  volume  and  the  recognition  of  the  9.5  mm  markers  coated  with  retro-reflective  materials  that
are  placed  on  experiment  participants.  Two  force  plates  purchased  from  AMTI  (Advanced  Mechanical  Technology,
Inc.,  Watertown,  MA)  have  been  placed  on  the  floor  of  the  lab.  Analog  force  and  moment  data  are  captured, 
synchronized,  and  stored  through  ViconPeak’s  analog  data  capture  system.  Using  a  mini-DV  camcorder,  this  system 
also  captures  images  of  upper-body  movement  and  stores  them  with  motion  capture  data.  An  additional  video 
capture  device  is  used  to  record  endoscopic  images  used  for  monitoring  instrument  movement  in  the  trainer  box  and 
evaluating  surgical  performance.  A  16-channel  electromyography  (EMG)  recorder  has  been  purchased  to  monitor  the
timing  and  relative  amplitude  of  muscle  activities.  This  EMG  system  uses  pre-amplifiers  integrated  into  the  electrodes,
which  improves  the  signal-to-noise  ratio  of  the  input.

Initial  Experiments:  Four  of  the  five  tasks  that  comprise  the  Fundamentals  of  Laparoscopic  Surgery  (FLS),  the
official  examination  program  used  by  the  Society  of  American  Gastrointestinal  and  Endoscopic  Surgeons  (SAGES),
are  being  used  in  our  experiments:  pegboard  transfer,  pattern  cutting,  endo-loop  placement,  and
suturing/intracorporeal  knot  tying  (see  example  endoscopic  images  in  Fig.  2).  Numerous  researchers  have  already
shown  that  performance  on  these  FLS  tasks  correlates  strongly  with  surgical  skill  level.  Seven  right-handed  surgeons
with  different  levels  of  minimally  invasive  surgery  (MIS)  experience  were  recruited  to  perform  the  tasks.  During  the
experiment  the  surgeons  stood  with  one  foot  on  each  force  plate.  So  that  surgeons  could  maintain  the  correct  elbow
joint  angle  while  holding  surgical  instruments  at  rest,  the  surgical  trainer  box  was  mounted  on  a  height-adjustable
platform.  A  standard  CRT  monitor  that  displayed  endoscopic  images  from  a  zero-degree  scope  was  located  at  eye
level  in  front  of  the  participants.  Wearing  medical  scrubs,  the  surgeons  had  39  reflective  markers  placed  on  body
landmarks  so  that  their  body  movements  could  be  reconstructed  using  motion  capture  techniques.  Marker  placement
followed  the  ViconPeak  guidelines  for  the  Plug-In-Gait  (PIG)  model.  Movements  of  body  segments  were  captured,  and
joint  movements  were  expressed  as  three  rotations:  flexion/extension,  abduction/adduction,  and  internal/external
rotation.  Force  plates  recorded  ground  reaction  forces  and  moments  to  provide  information  for  postural  stability
analysis.


Fig.  2:  Endoscopic  views  of  baseline  tasks


Current  Experimental  Research  Outcomes 

Optimizing  joint  kinematics  will  most  likely  allow  MIS  surgeons  to  achieve  better  surgical  performance.  Joint 
kinematics  characterized  by  range  of  motion  (ROM),  mean  joint  angle  (MJA),  and  mean  joint  movement  amplitude 
(MJMA)  were  correlated  with  performance  time  during  the  FLS  pegboard  transfer  task.  MJA  varied  with
performance  skill.  Participants  requiring  the  most  time  to  perform  showed  greater  mean  flexion  angles  (r=.684,  p<.05)
at  the  left  elbow  while  maintaining  approximately  90  degrees  at  the  right  elbow.  Regarding  the  left  wrist,  more
skilled  participants,  requiring  the  least  time,  showed  more  external  rotation  (r=.680,  p<.05)  while  less  skilled
subjects  maintained  the  neutral  position.  Less  skilled  subjects  showed  more  external  rotation  at  the  right  wrist
(r=-.751,  p<.05).  ROM  and  MJMA  did  not  differentiate  performance  skill  levels.  These  results  motivate  further
investigation  of  joint  movement  patterns  to  formulate  joint  control  strategies  for  optimal  laparoscopic  surgery
training  [4].
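
The three kinematic summaries and their correlation with task time can be sketched as follows. This is a minimal illustration, not the study's analysis code: the report does not define MJMA precisely, so the mean absolute deviation from the mean angle is used here as an assumed stand-in, and all function names and sample values are hypothetical.

```python
import math

def range_of_motion(angles_deg):
    """ROM: difference between the largest and smallest joint angle."""
    return max(angles_deg) - min(angles_deg)

def mean_joint_angle(angles_deg):
    """MJA: average joint angle over the trial."""
    return sum(angles_deg) / len(angles_deg)

def mean_movement_amplitude(angles_deg):
    """MJMA (assumed definition): mean absolute deviation from the MJA."""
    mja = mean_joint_angle(angles_deg)
    return sum(abs(a - mja) for a in angles_deg) / len(angles_deg)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-participant values: one MJA and one task time each.
left_elbow_mja_deg = [82.0, 88.0, 91.0, 95.0, 99.0, 104.0, 110.0]
task_time_s = [48.0, 55.0, 61.0, 70.0, 66.0, 90.0, 102.0]
print(f"r = {pearson_r(left_elbow_mja_deg, task_time_s):.3f}")
```

With one summary value per participant and one performance time per participant, each reported r corresponds to a single call of this kind.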

It  is  very  important  for  MIS  surgeons  to  maintain  proper  postural  stability  for  better  surgical  performance.  Postural 
stability  was  correlated  to  surgical  skill  level  represented  by  performance  time  during  pegboard  transfer,  pattern 
cutting,  and  endo-loop  placement  tasks.  Center  of  Pressure  (COP)  was  derived  separately  from  each  force  plate  and 
then  combined  to  obtain  overall  COP  which  showed  anterior-posterior  (A-P)  and  medial-lateral  (M-L)  sway.  It  was 
found  that  each  FLS  task  required  unique  postural  control  adjustment.  More  experienced  participants  showed  smaller 
COP  excursion  in  A-P  direction  during  pegboard  transfer  (r=.912,  p<.05)  and  in  M-L  direction  during  pattern  cutting 
(r=7.888,  P<.05).  During  endo-loop  placement,  COP  excursion  was  inversely  correlated  with  performance  time  (r=- 
.884,  P<.05,  r=-.824,  P<.05).  This  study  emphasized  that  optimized  ergonomics  should  be  determined  by  individual
task  [5].
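
Combining the two per-plate COP values into an overall COP can be sketched as a force-weighted average. The sketch assumes each plate's COP has already been expressed in a common laboratory frame as (M-L, A-P) coordinates in millimetres, and it takes "excursion" to mean the range along each axis; both are assumptions, as are the function names and sample numbers.

```python
def combine_cop(cop_left, fz_left, cop_right, fz_right):
    """Overall COP as the vertical-force-weighted mean of two plate COPs."""
    total = fz_left + fz_right
    return tuple((l * fz_left + r * fz_right) / total
                 for l, r in zip(cop_left, cop_right))

def excursion(values):
    """Range of a coordinate over the trial (assumed excursion measure)."""
    return max(values) - min(values)

def sway(samples):
    """Return (M-L excursion, A-P excursion) of the overall COP.

    samples: list of (cop_left, fz_left, cop_right, fz_right) tuples,
    where each COP is an (ml_mm, ap_mm) pair and fz is in newtons.
    """
    overall = [combine_cop(*s) for s in samples]
    ml = excursion([p[0] for p in overall])
    ap = excursion([p[1] for p in overall])
    return ml, ap

# Two hypothetical samples: even stance, then weight shifted to the left foot.
trial = [
    ((-100.0, 0.0), 400.0, (100.0, 0.0), 400.0),
    ((-100.0, 10.0), 600.0, (100.0, 10.0), 200.0),
]
print(sway(trial))
```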

When  suturing/intracorporeal  knot  tying,  the  most  difficult  of  the  FLS  tasks,  was  evaluated,  joint  ROM  was  used  to 
characterize  joint  kinematics.  During  this  task,  it  was  found  that  more  skilled  surgeons  relied  less  on  shoulder 
movement  than  less  skilled  participants.  Expert  surgeons  showed  smaller  ROM  at  the  dominant  wrist  and  greater 
ROM  at  the  non-dominant  wrist.  This  research  serves  as  the  starting  point  of  more  detailed  analysis  of  surgical 
movement  that  characterizes  the  joint  movement  and  joint  coordination  of  expert  surgeons  [6]. 

Previous  studies  in  surgical  ergonomics  have  shown  that  instrument  usage,  task  difficulty,  and  subject  skill  level  can 
be  correlated  to  postural  stability.  However,  these  studies  did  not  consider  the  possibility  that  surgeons  may 
strategically  change  their  stance  or  joint  movement  to  achieve  better  surgical  outcomes  while  potentially  subjecting 
themselves  to  greater  kinematic  risk.  In  our  study,  one  highly  experienced  and  skilled  surgeon  reported  the
development  of  carpal  tunnel  syndrome  in  both  of  his  wrists.  Still,  this  participant  was  able  to  finish  both  the
pegboard  transfer  and  pattern-cutting  tasks  significantly  faster  than  others,  within  a  minute  for  each  task.  To
minimize  wrist  flexion  during  the  pegboard  transfer  task,  the  surgeon  increased
the  abduction  angle  of  his  shoulder  so  that  his  hand  and  forearm  aligned.  During  pattern  cutting,  the  subject 
maintained  his  lower  body  position  and  stance  while  twisting  his  torso  in  a  strategy  that  appeared  to  stabilize  a 
tangential  direction  in  relation  to  the  cutting  while  maintaining  a  fixed  orientation  of  forearm,  wrist,  and  hand.  In  a 
different  trial  when  circle-cutting  was  the  task,  the  subject  changed  his  stance  primarily  by  shifting  foot  position  as 
needed  in  order  to  obtain  better  approach  angles  for  the  scissors.  These  compensatory  and  strategic  movements 
caused  an  increase  in  his  overall  postural  sway,  yet  they  did  not  necessarily  represent  postural  instability.  This  case
study  demonstrated  that  poor  postural  stability  or  joint  kinematics  do  not  necessarily  correlate  to  poor  performance 
but  may  instead  be  positive  compensatory  or  strategic  movements.  Therefore,  background  information  about 
participants,  which  might,  for  instance,  include  joint  impairment,  should  be  considered  as  important  ergonomic 
elements,  the  correlations  of  which  may  lead  to  more  accurate  and  specific  conclusions  about  optimal  postural 
stability  and  joint  kinematics  for  minimally  invasive  surgeons  [7]. 

Custom  Development  within  Experimental  Setup:  Markers  must  be  placed  so  that  the  best  camera  recognition  and 
body  segment  definition  are  obtained.  For  our  experiments,  the  ViconPeak  marker  placement  guidelines  were 
followed.  The  ViconPeak  Plug-In-Gait  marker  set  was  originally  designed  for  analysis  of  lower  and  upper  body
motion  in  conventional  situations.  Expecting  that  more  obstacles  would  be  located  between  markers  and  cameras  in 
a  surgical  environment,  we  have  developed  custom  marker  placement  that  is  achieved  by  using  clustered  markers. 
For  better  data  capturing,  three  or  four  markers  are  grouped  together  and  attached  to  a  body  segment  toward  which 
the  cameras  are  pointed.  At  least  three  markers  are  needed  to  define  a  segment.  If  a  segment  carries  only  three
markers  and  one  is  lost  during  data  collection,  the  segment  cannot  be  defined  and  the  biomechanical  model  stops
working.  One  or  more  extra  markers  on  each  segment  can  therefore  be  used  to  overcome  the  missing-marker  problem,
which  further  supports  the  need  for  a  custom  marker  set.  The  Plug-In-Gait  biomechanical  model  that  calculates
kinematic  data,  including  joint  angles,  cannot  be  used  as  is  with  a  custom  marker  set.  The  International  Society  of
Biomechanics  (ISB)  recently  published  an  article  suggesting  new  marker  sets,  segment  definitions,  and  angle
calculations  [8].  Plug-In-Gait  has  also  been  shown  to  suffer,  during  surgical  movements,  from  the  well-known
‘gimbal  lock’  problem,  which  occurs  at  a  particular  angle  of  the  shoulder  joint.  Therefore,  a  biomechanical  model
that  incorporates  the  new  ISB  recommendations  and  custom  marker  sets  is  now  being  developed  as  part  of  the
REVEAL  project.
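
Gimbal lock can be illustrated with any three-angle Euler sequence. The sketch below assumes a Z-Y-X sequence purely for illustration (the sequence used by Plug-In-Gait is not stated here, and the function names are hypothetical). When the middle angle reaches 90 degrees, two different angle triples yield the same rotation matrix, so the first and third angles can no longer be recovered separately.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyx(first_deg, middle_deg, last_deg):
    """Rotation matrix Rz(first) * Ry(middle) * Rx(last), angles in degrees."""
    a, b, c = (math.radians(v) for v in (first_deg, middle_deg, last_deg))
    return matmul(rot_z(a), matmul(rot_y(b), rot_x(c)))

def max_abs_diff(p, q):
    """Largest element-wise difference between two 3x3 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(3) for j in range(3))

# Middle angle at 90 degrees: only (last - first) matters, here 20 degrees.
locked = max_abs_diff(euler_zyx(30, 90, 50), euler_zyx(10, 90, 30))
# Middle angle at 45 degrees: the same two triples are clearly distinct.
free = max_abs_diff(euler_zyx(30, 45, 50), euler_zyx(10, 45, 30))
print(f"difference at 90 deg: {locked:.2e}, at 45 deg: {free:.3f}")
```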


REVEAL  Tool  Suite  Improvements 

REVEAL  tool  suite  development  efforts  continued  with  an  emphasis  on  deployment  tasks  and  usability.  New 
features  deployed  include  a  graphical  user  interface  for  the  VIBE  display  system,  an  integrated  video  display 
application  for  endoscopic  video  streams,  support  for  digital  video  streams,  and  improved  installation  tools. 

Graphical  User  Interface.  The  REVEAL  development  team  has  implemented  a  graphical  user  interface  (GUI)  to 
control  the  self-calibrating  display  system.  Previously,  a  set  of  shell  scripts  had  to  be  executed  at  the  command  line 
to  carry  out  calibration,  display  and  application-driver  functions.  These  functions  are  now  included  in  an  intuitive 
GUI  that  non-technical  staff  can  use  to  control  the  system  using  mouse-clicks  on  descriptive  drop-down  menus  and 
buttons. 

Integrated  video  display  application  for  endoscope  video  streams.  An  OpenGL  application  has  been  created  that
reads  video  input  streams  from  endoscope  controller  outputs  and  handles  either  monocular  or  stereoscopic  video
input.  This  application  enables  the  display  of  mono-  or  stereo  image  streams  using  the  self-calibrating  VIBE
display  system. 

Digital  video  support.  The  range  of  input  devices  handled  by  the  streaming  video  application  has  been  expanded  to 
include  digital  video  input.  Previously,  input  had  to  be  supplied  through  an  analog  connection  to  a  WinTV  interface
card.  This  is  a  proprietary  TV  tuner  card  that  severely  restricted  the  types  of  cameras  that  could  be  integrated  with  our
system.  The  addition  of  digital  video  support  allows  data  to  be  input  via  an  IEEE  1394  (FireWire)  interface  from  any
FireWire-capable  video  device.  FireWire  is  the  most  common  standard  in  use  today  for  digital  video  input  to
computers. 


Installation  tools.  A  set  of  hands-off  installation  tools  was  created  that  simplifies  the  installation  of  the  REVEAL
tools  on  a  target  system.  The  tools  detect  the  existing  configuration  on  the  target  system,  make  the  necessary
connections  between  existing  resource  locations  and  the  new  system,  and  notify  the  installer  of  any  missing
prerequisite  packages.


Stereo  Video  Display  Technology  Upgrades 

During  2005  we  made  two  changes  to  the  equipment  of  our  laboratory  that  significantly  improved  the  quality  of 
displayed  stereo  images.  These  included  the  use  of  glass  filter  material  for  polarizing  the  left  and  right  channels,  and 
the  purchase  of  polarity-preserving  screen  fabric. 

Glass  filters.  The  original  plastic  polarizing  filters  that  we  used  to  project  independent  left  and  right  visual  channels 
using  polarized  light  proved  inadequate  for  our  application.  This  was  due  largely  to  the  extreme  heat  generated  by 
the  COTS  DLP  projectors  we  are  using.  During  2005  we  replaced  the  plastic  filters  with  glass  filter  material, 
eliminating  problems  of  distortion  of  the  filter  surface  and  eliminating  the  risk  of  fire  from  burning  filter  material. 

Polarity-preserving  screen.  Our  rear-projection  screen  was  upgraded  using  polarity-preserving  screen  fabric  that
significantly  improves  channel  separation  for  projected  stereo  images. 

Stereo  Probe  Calibration 

A  Stereoscopic  Endoscope  is  an  endoscope  with  two  optical  paths,  either  separate  or  shared,  creating  two  images 
related  to  one  another  by  a  measurable  disparity  shift.  Such  an  endoscope  can  be  used  to  generate  a  stereoscopic 
view  for  a  surgeon,  as  with  the  DaVinci  robot  in  use  today.  In  order  to  use  such  an  endoscope  for  metric 
measurement  of  structures  in  the  operative  field,  it  is  necessary  to  calibrate  the  dual  optical  paths  according  to  a 
camera  model.  Once  calibrated,  it  is  possible  to  use  stereo  reconstruction  in  order  to  recover  Euclidean  metric 
measurements  from  the  endoscopic  images. 

This  measurement  capability  is  extremely  valuable  in  a  number  of  contexts  where  it  is  otherwise  difficult  to  gauge 
the  size  and  scale  of  the  operative  field.  For  example,  Fig.  3  shows  an  image  from  a  laparoscopic  ventral  and 
incisional  hernia  repair  where  a  small  patch  of  mesh  is  sewn  into  the  abdominal  wall  to  repair  the  defect  that  allowed 
the  herniation  to  occur.  Intraoperatively,  the  size  of  the  defect  must  be  determined  so  that  a  mesh  patch  of  the  proper 
size  can  be  introduced  into  the  surgical  site  through  a  trocar.  The  determination  of  the  dimensions  of  the  defect  is 
currently  performed  using  a  tape  measure,  manipulated  using  graspers.  This  step  requires  the  introduction  and 
removal  of  the  tape  to  perform  the  measurement.  Using  a  stereoscopic  laparoscope  and  real-time  reconstruction  of 
the  three-dimensional  anatomy  allows  such  measurements  to  be  taken  on  the  imagery  using  virtual  measuring  tapes. 
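
A virtual measuring tape of this kind can be sketched for the simplest case of a rectified, calibrated stereo pair under the pinhole model. The focal length, baseline, principal point, and pixel coordinates below are hypothetical values chosen for illustration, and the function names are not from the REVEAL tool suite.

```python
import math

def triangulate(u_left, v_left, u_right, f_px, baseline_mm, cx, cy):
    """Recover a 3D point (mm) from matched pixels in a rectified stereo pair.

    Assumes both views share focal length f_px (pixels) and principal
    point (cx, cy), with the cameras separated by baseline_mm along x.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the scope")
    z = f_px * baseline_mm / disparity   # depth from disparity
    x = (u_left - cx) * z / f_px         # back-project the left pixel
    y = (v_left - cy) * z / f_px
    return (x, y, z)

def tape_length_mm(p, q):
    """Euclidean distance between two reconstructed points."""
    return math.dist(p, q)

# Hypothetical calibration: 500 px focal length, 5 mm baseline.
F_PX, BASELINE_MM, CX, CY = 500.0, 5.0, 320.0, 240.0
edge_a = triangulate(420.0, 240.0, 395.0, F_PX, BASELINE_MM, CX, CY)
edge_b = triangulate(320.0, 240.0, 295.0, F_PX, BASELINE_MM, CX, CY)
print(f"defect width: {tape_length_mm(edge_a, edge_b):.1f} mm")
```

The user would pick the two endpoints in the image; the measurement itself needs only one stereo pair, which is what makes it instantaneous.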


Fig.  3:  Laparoscopic  view  of  the  measurement  tape  for  hernia  repair 


In  this  work  we  report  calibration  results  for  a  stereoscopic  endoscope  that  support  the  ability  to  make  instantaneous 
measurements  in  the  image  from  a  single  stereo  pair.  Our  initial  experiments  also  indicate  that  the  stereo
measurement  accuracy  can  be  improved  by  combining  the  estimates  from  stereo  pairs  with  monocular-view 
structure-from-motion  estimates  derived  from  tracked  features  over  a  number  of  frames. 

Our  work  yields  metric  information  to  reduce  the  difficulty  in  estimating  sizes  without  the  need  for  a  reference 
object  in  the  scene  or  an  external  tracking  system.  The  stereoscopic  system  can  provide  metric  information  even 
when  only  one  image  stream  from  the  scope  is  in  fact  necessary  during  the  procedure  (if  there  is  a  situation  where 
stereo  display  for  the  surgeon  is  not  a  requirement). 

The  lens  system  of  the  stereoscopic  endoscope  where  both  views  use  the  same  optical  path  creates  a  calibration 
challenge  because  of  the  difficulty  in  modelling  the  system  directly  as  a  pinhole.  We  have  developed  a  staged 
calibration  process  that  first  removes  global  non-linear  distortion  from  the  image  by  calculating  an  optimized  global 
solution  to  a  polynomial  radial  distortion  model.  We  use  three  free  parameters  in  the  model,  including  two 
parameters  for  the  radial  decentering  and  a  third  radially  symmetric  coefficient.  The  first  stage  uses  images  of 
straight  lines  in  order  to  solve  for  an  optimal  set  of  parameters  that  minimizes  the  global  distortion  according  to  the 
model.  The  constraint  is  that  straight  lines  in  the  scene  must  remain  straight  in  the  image  under  the  pinhole 
perspective  projection. 
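
The straight-line constraint used in the first stage can be sketched as a cost function. This is an illustration under stated assumptions, not the calibration code itself: the three free parameters are read here as a distortion center (cx, cy) plus one radially symmetric coefficient k, the model form p_u = c + (p_d - c)(1 + k r^2) is assumed, and the function names are hypothetical.

```python
import math

def undistort(point, cx, cy, k):
    """Apply the assumed model p_u = c + (p_d - c) * (1 + k * r^2)."""
    dx, dy = point[0] - cx, point[1] - cy
    scale = 1.0 + k * (dx * dx + dy * dy)
    return (cx + dx * scale, cy + dy * scale)

def straightness_cost(points):
    """Sum of squared perpendicular distances to the best-fit line.

    This equals the smaller eigenvalue of the 2x2 scatter matrix of the
    points, so it is zero exactly when the points are collinear.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    return (sxx + syy) / 2 - math.hypot((sxx - syy) / 2, sxy)

def calibration_cost(lines, cx, cy, k):
    """Total straightness cost over all imaged lines for one parameter set."""
    return sum(straightness_cost([undistort(p, cx, cy, k) for p in line])
               for line in lines)

# A line that is already straight: zero cost with k = 0, positive cost once
# a non-zero k bends it.
line = [(x, 50.0) for x in (-100.0, -50.0, 0.0, 50.0, 100.0)]
print(calibration_cost([line], 0.0, 0.0, 0.0))
print(calibration_cost([line], 0.0, 0.0, 1e-5))
```

A numerical optimizer would search over (cx, cy, k) for the minimum of `calibration_cost` across many imaged lines; the report does not say which optimizer was used.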

The  solution  to  the  distortion  model  allows  the  input  images  to  be  unwarped  according  to  the  parameters  of  the  radial 
distortion  model,  which  serves  as  input  to  the  second  stage  of  the  calibration.  This  stage  uses  known  fiducials  on 
targets  in  order  to  solve  a  system  of  equations  for  the  intrinsic  parameters  of  the  camera.  Once  this  optimization  has 
been  completed,  it  is  possible  to  use  the  two  calibrated  optical  paths  for  stereo  matching  (in  a  single  corresponding 
frame)  and  3D  reconstruction. 

Using  a  single  stereo  pair,  a  matching  structure  yields  a  3D  point.  We  augment  this  measurement  with  a  set  of 
equations  over  multiple  frames  that  assume  the  matching  structure  can  be  tracked  for  some  set  of  frames.  By 
combining  3D  position  estimates  from  stereo  with  2D  constraints  from  feature-based  structure-from-motion
equations,  we  are  able  to  achieve  a  tighter  bound  on  the  accuracy  of  the  measurement  process.
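
Why a second source of estimates tightens the bound can be illustrated with inverse-variance weighting. This is a deliberately simpler stand-in for the multi-frame equations described above, not the method used in this work; it assumes the two estimates have independent errors with known standard deviations, and the numbers are hypothetical.

```python
def fuse(estimates):
    """Inverse-variance weighted mean of independent estimates.

    estimates: list of (value_mm, sigma_mm) pairs.
    Returns (fused_value_mm, fused_sigma_mm).
    """
    weights = [1.0 / (sigma * sigma) for _, sigma in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    sigma = (1.0 / total) ** 0.5
    return value, sigma

# Hypothetical depth estimates of one feature: stereo, then structure-from-motion.
stereo = (100.0, 8.0)
motion = (104.0, 8.0)
value, sigma = fuse([stereo, motion])
print(f"fused depth {value:.1f} mm, sigma {sigma:.2f} mm")
```

The fused standard deviation is never larger than the smallest input standard deviation, which is the sense in which combining estimates narrows the error profile.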


Results  from  each  stage  of  the  calibration  process  show  (1)  how  distortion  is  removed  from  the  stereo  pair;  (2)  the 
intrinsic  parameters  calculated  from  the  unwarped  images  of  known  fiducial  patterns;  (3)  the  error  in  the  3D 
reconstruction  of  known  points  in  the  scene;  and  (4)  estimates  of  the  relative  error  for  measurements  in  the  operative 
field  at  known  distances  from  the  scope. 

With  respect  to  (1),  once  the  global  unwarp  is  applied  to  the  image  the  mean  error  is  5  pixels  (reduced
from  as  much  as  20-30  pixels  for  many  scopes).  The  intrinsic  parameters  calculated  from  these  unwarped  images  yield
projection  matrices  with  mean  reprojection  errors  of  3  pixels  (for  the  set  of  input  values).  Using  these  matrices  for
stereo  reconstruction,  the  mean  error  in  the  3D  position  of  reconstructed  points  ranges  from  8  mm  near  the  center  of
the  image  field  to  15  mm  near  the  edge.  These  measurements  are  relative  to  depths  across  the  working  volume  of
approximately  125  mm.  The  results  indicate  that  bench  calibration  of  stereoscopic  endoscopes  can  be  done  to  a 
degree  that  is  good  enough  to  make  instantaneous  measurements  for  procedures  like  hernia  repairs  and  estimates  of 
sizes  and  areas  of  regions  of  interest.  Errors  in  measurements  from  stereoscopic  pairs  alone  lead  us  to  examine  the 
fusion  of  data  over  multiple  frames  using  structure-from-motion  in  order  to  narrow  the  error  profile  and  extend  the 
applicability  of  the  measurement  capability  to  micro  features.  Our  preliminary  results  in  the  fusion  of  measurements
(using  multiple  frames  and  structure-from-motion)  indicate  we  can  reduce  the  mean  error  to  4  mm  across  the  image
field.
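
To put the reported 3D errors in proportion, they can be expressed as a fraction of the working depth. The calculation below is one possible reading of "relative to depths across the working volume", offered as an assumption; the helper name is hypothetical.

```python
WORKING_DEPTH_MM = 125.0  # approximate working-volume depth quoted in the text

def relative_error_percent(error_mm, depth_mm=WORKING_DEPTH_MM):
    """Mean 3D error as a percentage of the working depth."""
    return 100.0 * error_mm / depth_mm

reported = [
    ("stereo, image center", 8.0),
    ("stereo, image edge", 15.0),
    ("stereo fused with structure-from-motion", 4.0),
]
for label, error_mm in reported:
    print(f"{label}: {error_mm} mm = {relative_error_percent(error_mm):.1f}% of depth")
```

On this reading, 8 mm, 15 mm, and 4 mm correspond to 6.4%, 12%, and 3.2% of a 125 mm working depth.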

Bench  calibration  of  stereoscopic  endoscopes  can  provide  a  valuable  way  to  make  in-the-image  instantaneous 
measurements  from  a  single  stereo  pair  with  enough  accuracy  to  save  time  in  certain  procedures  where  metric 
measurements  are  necessary  for  making  decisions  and  recording  anomalies.  Errors  in  reconstruction  are  large 
enough  that  they  warrant  continued  work  on  calibration  methods  and  integration  of  second-order  measurement
equations  (e.g.,  structure  from  motion,  structured  light)  in  order  to  narrow  the  error  profile. 


Performance  Modeling  and  Analysis 

In  accordance  with  stated  year  two  objectives,  the  creation  of  performance  models  and  analysis  of  system 
performance  were  carried  out  for  the  display  system.  Image  latency  and  image  quality  are  critical  factors 
determining  end-user  perceived  quality  for  the  display  of  surgical  images,  so  we  focused  on  these  two  areas.


Fig.  4:  Display  System  Latency  Model  with  Experimental  Timings 

Figure  4  illustrates  the  chain  of  processing  steps  from  initial  world  event  (i.e.,  any  action  in  a  scene  observed  by  the
camera)  to  final  display.  Experimentation  with  our  current  configuration  yielded  the  quantitative  latency  values 
shown  in  the  figure.  Overall  latency  is  currently  running  120-160  ms.  A  generally  accepted  goal  for  latency  is  100 
ms  or  less,  thus  there  is  room  for  improvement  in  our  current  performance. 

The  area  most  likely  to  yield  significant  improvement  is  the  image  capture  process  at  the  beginning  of  the  sequence 
of  actions.  As  shown,  20-40  ms  of  latency  is  incurred  just  capturing  and  processing  the  image  within  the  digital 
camera.  Another  65-85  ms  accumulates  in  the  frame  grabbing  process  followed  by  approximately  10  ms  of  latency 
for  the  OpenGL  video  capture  and  display  application. 
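
The latency figures quoted above can be tallied as a simple budget. The sketch adds the stage minima and maxima, which assumes the stages are sequential and that their extremes can coincide; the portion of the overall latency not itemized in the text is reported as unattributed rather than assigned to any stage. The names are hypothetical.

```python
# Stage latencies in milliseconds as (min, max), taken from the text.
STAGES_MS = {
    "camera capture and processing": (20, 40),
    "frame grabbing": (65, 85),
    "OpenGL capture and display application": (10, 10),
}
OVERALL_MS = (120, 160)   # measured end-to-end latency
TARGET_MS = 100           # generally accepted goal

def sum_ranges(stages):
    """Add the per-stage minima and maxima."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high

def subtract_ranges(overall, itemized):
    """Latency not accounted for by the itemized stages."""
    return overall[0] - itemized[0], overall[1] - itemized[1]

def required_savings(overall, target):
    """Reduction needed to reach the target in the best and worst case."""
    return max(0, overall[0] - target), max(0, overall[1] - target)

itemized = sum_ranges(STAGES_MS)
print("itemized stages:", itemized, "ms")
print("unattributed:", subtract_ranges(OVERALL_MS, itemized), "ms")
print("savings needed:", required_savings(OVERALL_MS, TARGET_MS), "ms")
```

Because frame grabbing dominates the itemized stages, even a modest reduction there would bring the best case close to the 100 ms goal.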
Fig.  5:  Display  System  Color  Reproduction  Model


Figure  5  presents  a  preliminary  model  of  image  processing  steps  that  may  impact  color  reproduction.  The  accuracy 
of  color  reproduction  is  critical  in  the  display  of  surgical  images  because  variations  in  tissue  coloring  can  indicate 
boundaries  between  anatomical  structures,  or  they  can  indicate  a  boundary  between  healthy  and  diseased  tissues.  At 
present  the  experimental  measurement  and  evaluation  of  color  reproduction  in  our  system  is  an  open  problem  that  we 
intend  to  address  during  Year  3. 

Enhanced  Experimental  Tools  for  Cognitive  Ergonomics  Experiments 

Our  current  work  recognizes  the  need  to  emphasize  both  physical  and  cognitive  ergonomics  during  the  development 
and  assessment  of  new  visualization  tools.  There  has  been  little  precedent  for  the  use  of  cognitive  assessment  tools  in 
the  context  of  laparoscopic  surgery;  therefore,  we  have  been  devoting  considerable  effort  to  developing  and 
evaluating  such  outcome  measures.  These  tools  are  specifically  designed  to  determine  whether  technological 
innovations  in  the  operating  room  have  the  following  desired  effects.  Do  they  reduce  the  surgeon’s  mental 
workload?  Do  they  reduce  perceived  stress?  And  do  they  enhance  the  surgeon’s  situation  awareness? 


The  mental  workload,  stress,  and  situation  awareness  measures  that  we  are  currently  testing  must  meet  several 
criteria.  They  must  be  reliable  and  sensitive  enough  to  allow  assessments  involving  relatively  few  research 
participants  (ultimately  surgeons).  They  must  also  be  easy  to  implement  in  the  surgical  environment.  Finally,  they 
must  be  accepted  by  the  surgeons  whose  performance  will  be  measured.  To  date,  our  research  has  focused  on 
developing  appropriate  mental  workload  measures  and  testing  them  for  sensitivity.  However,  we  are  also  exploring 
the  application  of  an  existing  stress  measure,  the  Short  Stress  State  Inventory  (SSSI),  and  we  are  developing  a  novel 
situation  awareness  measure  that  involves  testing  the  surgeon’s  ability  to  recover  from  interruptions. 

Experimental  Environment 

During  2005  a  second-generation  software  tool  for  cognitive  ergonomics  experiments  was  developed  and  deployed 
in  the  REVEAL  lab.  The  new  program  represents  a  significant  step  toward  realistically  modeling  laparoscopic 
surgical  tasks  for  psychological  study. 


Fig.  6:  Display  of  Enhanced  Secondary  Task  Program 


Figure  6  presents  the  new  display  layout  using  the  tiled  projected  display  to  present  images  from  the  laparoscope. 

The  primary  task  is  now  a  standard  minimally  invasive  surgery  task  involving  manipulation  of  objects  inside  a  black 
box  using  graspers.  The  secondary  task  is  now  confined  to  a  pre-defined  region  (shown  as  a  blue  circle  in  the 
figure). 

A  further  step  toward  realism  was  the  elimination  of  computer  mouse  input.  Surgeons  typically  use  voice  response 
systems  to  interact  with  computer  systems  intra-operatively,  and  we  have  employed  that  technology  for  identification 
of  secondary  task  events.  Now,  instead  of  clicking  a  mouse  button  to  signal  the  appearance  of  a  secondary  task  item, 
the  experimental  subject  simply  speaks  into  a  microphone  which  converts  their  voice  into  an  input  signal  that  the 
computer  recognizes  as  a  secondary  task  recognition  event. 


Development  and  Selection  of  Workload  Measures 

Our  initial  evaluation  of  mental  workload  measures  took  place  in  fall  2004.  The  NASA-TLX,  a  subjective  measure 
composed  of  six  rating  scales  and  previously  validated  in  aviation  environments,  was  evaluated  using  a  simple  manual
control  task  that  required  precise  positioning  movements  of  a  cursor  on  a  computer  monitor.  We  manipulated 
workload  by  inserting  various  lag  times  into  the  response  of  the  display  to  control  inputs  by  the  subjects.  Results 
indicated  that  the  measure  was  sensitive  to  the  increased  effort  required  to  manipulate  the  cursor  with  lags  as  small 
as  250  msec.  This  was  true  even  though  research  participants  were  often  unaware  that  lags  of  this  duration  were 
even present. In addition, the NASA-TLX was able to detect statistically reliable workload increments with as
few as four research participants (for this particular display manipulation). This last finding is important given the
limited availability of surgeons to perform evaluation experiments. The NASA-TLX was therefore carried over to
the current year’s research, where it is being used in a simulated surgical environment.
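As an illustration of how a NASA-TLX score is typically computed, the sketch below implements both the unweighted ("raw") and the weighted form of the index. The report does not state which form was used, so this is an assumption-laden example rather than the project's scoring procedure; the ratings and weights shown are invented for illustration.

```python
# The six NASA-TLX subscales, each rated on a 0-100 scale.
SUBSCALES = (
    "mental_demand", "physical_demand", "temporal_demand",
    "performance", "effort", "frustration",
)

def raw_tlx(ratings):
    """Unweighted ("raw") TLX: the mean of the six subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

def weighted_tlx(ratings, weights):
    """Weighted TLX. Each weight counts how often a subscale was chosen
    in the 15 pairwise comparisons, so the weights must sum to 15."""
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15

# Hypothetical data for one participant (not taken from the study).
example_ratings = {
    "mental_demand": 70, "physical_demand": 30, "temporal_demand": 50,
    "performance": 40, "effort": 60, "frustration": 20,
}
example_weights = {
    "mental_demand": 5, "physical_demand": 1, "temporal_demand": 3,
    "performance": 2, "effort": 4, "frustration": 0,
}

if __name__ == "__main__":
    print(raw_tlx(example_ratings))
    print(weighted_tlx(example_ratings, example_weights))
```

Comparing such scores across lag conditions is one way the sensitivity described above could be examined.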


We  also  developed  and  evaluated  a  secondary  task  measure  requiring  detection  of  slowly  expanding  circles 
presented  to  research  participants’  visual  peripheries.  The  assumption  behind  this  measure  is  that  when  a  primary 
surgical  task  requires  more  effort,  these  peripheral  targets  will  go  undetected  for  longer  periods  of  time.  The 
advantage  of  such  secondary  tasks  for  workload  assessment,  compared  to  subjective  measures,  is  that  they  are 
diagnostic  of  the  specific  types  of  mental  workload  that  may  be  reduced  or  increased  by  changes  in  instrumentation. 
In  our  case,  because  of  our  focus  on  display  innovation,  we  are  seeking  measures  that  are  fairly  specific  to  changes 
in  visuospatial  cognitive  effort.  Initial  experiments  using  the  lag  task  described  above  revealed  that  we  needed  to 
modify  the  eccentricity,  growth  rate,  and  contrast  of  the  peripheral  targets  in  order  to  make  the  task  sufficiently 
sensitive.  These  modifications  were  made  during  the  current  year  as  we  moved  this  secondary  task  from  the  highly 
abstract,  laboratory  setting  used  for  the  initial  tests  to  the  current  simulated  surgical  environment. 

The  current  year  has  seen  the  further  development  of  the  peripheral  detection  task  to  accommodate  1)  the 
environmental  constraints  of  testing  research  participants  on  an  actual  surgical  trainer  and  2)  using  views  from  the 
endoscope  that  range  from  traditional  single-monitor  displays  to  the  large,  multi-projector  tiled  displays  being 
developed  for  this  project.  To  do  this,  we  have  developed  a  “donut  display”  (or  display  annulus)  that  restricts  the 
peripheral  targets  to  a  fixed  band  that  encircles  any  display  that  we  would  like  to  evaluate.  These  peripheral  targets 
are  now  presented  independently  of  the  central  (laparoscopic)  displays  allowing  for  controlled  comparisons  of 
workload  between  radically  different  display  formats. 
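The display annulus described above can be sketched as follows. This is a minimal example, assuming that targets are placed uniformly over the area of a circular band and that they grow linearly in time; the function names, units, and parameter values are hypothetical and are not taken from the REVEAL software.

```python
import math
import random

def sample_annulus_point(r_inner, r_outer, rng):
    """Return (x, y) for a target centre placed uniformly by area in the
    band r_inner <= r <= r_outer around the central display."""
    # Drawing r^2 uniformly gives a uniform density over area.
    r = math.sqrt(rng.uniform(r_inner ** 2, r_outer ** 2))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)

def target_radius(t, growth_rate, start_radius=0.0):
    """Radius of a slowly expanding target t seconds after onset
    (linear growth is an assumption made for this sketch)."""
    return start_radius + growth_rate * t

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a repeatable trial sequence
    x, y = sample_annulus_point(10.0, 12.0, rng)
    print(x, y, target_radius(2.0, 0.5))
```

Because the band is defined relative to the central display, the same sampling routine could be reused for a single monitor or a tiled projected display by changing only the two radii.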

Our  current  belief  is  that  the  combination  of  the  NASA-TLX  and  the  peripheral  detection  tasks  will  form  our  “gold 
standard”  for  workload  assessment  in  a  simulated  surgical  environment.  Note  that  the  simulated  environment  is 
critical  for  rapid  test  and  evaluation  of  new  displays  during  design  iterations  in  our  lab.  However,  we  recognize  that 
in  an  actual  surgical  environment,  the  secondary  task  we  are  using,  although  sensitive,  is  likely  to  be  too  intrusive 
for  acceptance  by  surgeons.  Therefore,  we  are  also  developing  a  simpler  secondary  task  for  inclusion  during  actual 
surgeries.  This  task  involves  time  estimation.  Cognitive  engagement  in  a  task  redirects  mental  resources  from  our 
normal  internal  time-keeping  and,  as  a  result,  people  tend  to  underestimate  the  duration  of  lapsed  time.  This 
tendency  increases  predictably  with  increased  mental  workload. 
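One simple way to quantify the underestimation described above is the ratio of estimated to actual duration. The sketch below is illustrative only: the ratio measure and the example numbers are assumptions, not the scoring method used in the project.

```python
def estimation_ratio(estimated_s, actual_s):
    """Ratio of estimated to actual elapsed time; values below 1.0
    indicate that the interval was underestimated."""
    if actual_s <= 0:
        raise ValueError("actual duration must be positive")
    return estimated_s / actual_s

def mean_estimation_ratio(trials):
    """Average ratio over (estimated_s, actual_s) pairs for one condition."""
    if not trials:
        raise ValueError("at least one trial is required")
    return sum(estimation_ratio(e, a) for e, a in trials) / len(trials)

if __name__ == "__main__":
    # Hypothetical trials: 60-second intervals judged as 45 s and 30 s.
    print(mean_estimation_ratio([(45.0, 60.0), (30.0, 60.0)]))
```

Under the hypothesis stated above, a lower mean ratio in one display condition than another would suggest higher mental workload in that condition.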

Consultation with surgeons at the University of Maryland has indicated that mental “time-keeping”, which is an
intrinsic  part  of  the  self-pacing  that  characterizes  some  surgeons’  behaviors,  will  be  more  readily  accepted  by  this 
population  than  other  secondary  tasks.  Our  time  estimation  task  is  currently  being  implemented  at  the  University  of 
Maryland  for  evaluation  of  workload  changes  as  a  function  of  skill  level  among  surgeons.  We  intend  to  compare 
these  results  with  those  we  are  currently  collecting  in  our  lab  to  validate  the  tool  prior  to  use  in  operational 
evaluations  of  the  REVEAL  displays  at  the  University  of  Maryland. 

In  addition  to  the  various  measures  described  above,  we  have  also  developed  a  rapid  training  protocol  for  allowing 
nonsurgeon research participants to attain minimal skill in the use of laparoscopic instruments for simple
peg-positioning tasks. Such tasks are used in standard assessments of laparoscopic skill, and therefore are part of a
battery  of  tasks  we  will  use  in  evaluating  laparoscopic  displays.  This  training  protocol  is  critical  because  the  use  of 
actual  surgeons  for  every  evaluation  is  prohibitive.  Training  of  five  participants  with  no  prior  knowledge  of 
endoscopic  surgery  revealed  that  acceptable  performance  levels  could  be  achieved  after  a  two-hour  training  session. 
In other words, after our two-hour training session, these subjects could be used in initial assessments of workload
changes  associated  with  changes  in  display  formats. 


Development  and  Selection  of  Stress  and  Situation  Awareness  Measures 

We  have  been  concurrently  developing  stress  and  situation  awareness  measures  appropriate  for  the  surgical 
environment.  Although  workload  is  a  primary  concern,  ideal  outcomes  from  display  innovations  would  also  involve 
reduced  stress  and  enhanced  situation  awareness.  We  have  had  success  with  the  Short  Stress  State  Inventory  during 
the  testing  of  our  laparoscopic  training  protocol.  This  measure  is  a  subjective  stress  indicator  that  looks  at  affective, 
cognitive,  and  motivational  aspects  of  stress  and  is  a  brief  version  of  the  Dundee  Stress  State  Inventory. 


Still  in  the  early  stages  of  development  is  a  new  measure  of  situation  awareness  that  should  be  applicable  in  a  variety 
of  simulated  surgical  settings  (i.e.,  conventional  and  laparoscopic).  Situation  awareness,  or  the  perception  and  recall 
of  evolving  events,  is  considered  critical  for  planning,  decision  making,  reacting  to  unexpected  events,  and 
recovering  from  mistakes  or  accidents.  Situation  awareness  has  traditionally  been  assessed  using  a  procedure  that 
blanks  research  participants’  views  of  their  tasks  and  requires  them  to  answer  a  series  of  questions  about  what  was 
happening  just  before  the  blanking  (and  what  they  predict  will  happen  in  the  near  future).  Like  many  of  the 
traditional  workload  measures,  we  feel  this  procedure  is  too  intrusive  to  use  in  many  surgical  contexts.  Further,  the 
development  of  appropriate  questions  requires  time  consuming  analyses  that  will  have  to  be  repeated  for  each  new 
surgical  task  or  simulated  scenario.  Instead,  we  are  exploring  the  idea  that  task  interruptions  per  se,  without 
associated  questions,  could  be  used  to  assess  situation  awareness.  That  is,  when  situation  awareness  is  high, 
recovery  from  these  disruptions  should  be  nearly  seamless.  We  have  recently  collected  data  from  eight  subjects 
performing  a  simple  video  game.  We  used  both  traditional  and  modified  situation  awareness  measures  and  hope  to 
determine whether the measures are comparable. Data analyses have not been completed at this time as we are still
collecting  data.  However,  if  the  new  and  old  procedures  provide  substantially  similar  results,  then  we  will 
recommend  the  use  of  the  new  measure  for  assessment  of  possible  changes  in  situation  awareness  that  might 
accompany  changes  in  operating  room  technology. 

3.  Key  Research  Accomplishments:  Project  Milestones  2005 

The  2005  project  plan  included  two  distinct  sets  of  milestones,  one  for  visualization  technology  development 
activities,  and  a  second  for  ergonomic  assessment  activities.  The  milestones  and  our  summary  of  progress  in 
reaching  each  are  assessed  in  the  sections  that  follow. 

Primary  Milestones:  Visualization  Technology 

1)  Deploy  and  evaluate  first  iteration  of  cluster-based  distributed  architectural  framework 

A cluster-based system with multi-projector OpenGL display capabilities was deployed in the
UMMC  simulation  center.  Staff  from  the  UMMC  department  of  general  surgery  have  begun 
experimenting  with  the  system  and  provided  feedback  to  the  UK-based  developers.  A  system  is  in 
place  to  gather  this  feedback  and  use  the  information  to  improve  the  system  architecture  and 
implementation. 

2)  Deploy  and  evaluate  display  back-end  for  probe  camera  data 

The  system  deployed  at  UMMC  includes  support  for  input  and  display  of  real-time,  live  video 
feeds  from  endoscopic  camera  probes.  Staff  from  the  UMMC  department  of  general  surgery  have 
begun  experimenting  with  the  display  of  laparoscopic  camera  images  and  provided  feedback  to  the 
UK-based  developers.  A  system  is  in  place  to  gather  this  feedback  and  use  the  information  to 
improve  the  system  architecture  and  implementation. 

3)  Integrate  stereo  probe  device  support  into  acquisition  and  back-end  display  framework 

The  “SmartStereo”  application  developed  this  year  provides  an  interface  for  synchronized  stereo 
video  streams  from  stereoscopic  laparoscopic  camera  probes.  The  synchronized  stereo  pairs  can 
either  be  displayed  directly  using  a  stereo  projection  system  as  we  have  demonstrated  in  our  lab, 
or  they  can  be  used  to  perform  real-time  behind  the  scenes  stereo  reconstruction.  Software  to 
perform  stereo  reconstruction  and  extract  measurement  information  is  currently  in  an  advanced 
stage  of  development. 

4)  Design  and  test  algorithms  for  low-latency  probe-data  cue  extraction  (reconstruction,  enhancement, 
overlays) 


As  described  under  “Stereo  Probe  Calibration,”  above,  we  have  been  developing  the  necessary 
framework to calibrate our stereoscopic laparoscope, compute a reconstructed three-dimensional
scene, and extract measurements from a surgical scene reconstructed in this way. We have
recently  documented  the  results  of  the  calibration  process  and  analysis  of  the  results  of 
reconstruction  in  an  extended  abstract  submitted  to  Computer  Assisted  Radiology  and  Surgery. 

5)  Design  and  test  algorithms  for  non-invasive  ergonomic  cue  extraction  in  simulation/surgical  setting 

Original  plans  foresaw  the  possibility  of  augmenting  externally  observed  ergonomic  data  with 
information  that  could  be  inferred  from  the  interior  view  of  the  surgical  scene.  However,  a 
detailed analysis of the capabilities of commercial motion capture systems has led us to conclude
that  extraction  of  information  from  the  interior  scene  to  augment  external  observations  will  not  be 
necessary.  The  Vicon  system  deployed  in  the  UMMC  Simulation  Center  this  year  is  capable  of 
tracking  and  recording  position  in  real-time  for  all  of  the  desired  fiducials  with  sub-millimeter 
precision. 

6)  Design  display  mode  alternatives  for  same-display  integration  of  probe-data,  extracted  cue  data  and  other 
overlay  information 

Work  began  this  year  on  architecture  and  design  of  a  user  interface  that  will  integrate  real-time 
probe  camera  data  with  additional  information  sources  computed  on-line,  or  computed  pre- 
operatively.  The  architecture  was  documented  in  the  journal  article  “Computing  Support  for 
Information-Rich  Laparoscopy”  (see  Publications  section  for  bibliographic  reference).  Design 
work to develop a plan to implement the described architecture is ongoing.

7)  Test  and  support  performance  analysis  of  trial  environment  configurations:  stereo,  display  configurations, 
integrated  cues 

The  previous  section  described  analytic  models  developed  for  performance  analysis  of 
REVEAL’s display architecture. Experimental work with the current models, and the
development  of  additional  models,  is  ongoing. 
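Milestones 3 and 4 above refer to stereo reconstruction and the extraction of measurements from a reconstructed scene. As a minimal sketch of what such a measurement involves, the example below triangulates matched points and measures the distance between them. It assumes an ideal, rectified pinhole stereo pair with a known focal length and baseline; all names and numeric values are hypothetical, and the project's actual calibration and reconstruction software is not described at this level of detail in the report.

```python
import math

def triangulate(u_left, u_right, v, focal_px, baseline_mm, cx, cy):
    """Recover a 3D point (in mm, left-camera frame) from a matched pixel
    pair in a rectified stereo image pair.

    u_left, u_right: column of the feature in the left and right images
    v:               row of the feature (identical in both after rectification)
    focal_px:        focal length in pixels
    baseline_mm:     distance between the two optical centres
    cx, cy:          principal point in pixels
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_mm / disparity   # depth from disparity
    x = (u_left - cx) * z / focal_px         # back-project column
    y = (v - cy) * z / focal_px              # back-project row
    return x, y, z

def distance_mm(p, q):
    """Euclidean distance between two reconstructed 3D points."""
    return math.dist(p, q)

if __name__ == "__main__":
    # Hypothetical calibration values and pixel matches.
    a = triangulate(340, 300, 240, focal_px=800, baseline_mm=5, cx=320, cy=240)
    b = triangulate(380, 340, 240, focal_px=800, baseline_mm=5, cx=320, cy=240)
    print(a, b, distance_mm(a, b))
```

With a small baseline, as on a stereoscopic laparoscope, depth is sensitive to small disparity errors, which is one reason careful calibration matters for measurement accuracy.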


Primary  Milestones:  Ergonomic  Assessment 

1)  Organize  performance  trials  on  baseline  tasks 

The  previous  section  describes  the  experimental  and  computational  environment  in  place  in  this 
project  year  for  the  measurement  of  key  data  in  order  to  assess  ergonomic  features.  Initial 
experiments  and  baseline  trials  demonstrate  the  equipment  and  initial  findings. 

2)  Assess  latency  impact  of  distributed  architecture 

Initial  latency  analysis  on  deployed  hardware  (primarily  display  system  and  measurement 
environment)  in  progress. 

3)  Upgrade  hardware/software  environments 

Visually  Immersive  Blended  Environment  (VIBE)  display  architecture  deployed  and 
demonstrated.  Integration  with  measurement  environment  underway,  including  latency  analysis. 

4)  Conduct  implementation  tests  on  stereo-probe  acquisition  and  display 

Stereo  display  testing  and  deployment  still  in  progress  (this  milestone  is  not  yet  complete) 


5)  Design  and  perform  stereo-based  human  skills  study 


Stereo skills tests and baseline performance study still not completed (this milestone is
dependent upon completion of milestone 4)


6)  Design  and  test  non-invasive  assessment  algorithms 

Complete  experimental  deployment  (Vicon  tracking  system)  at  U  Maryland  and  substantial 
progress  on  ergonomic  assessment  models,  including  cognitive  skills  testing. 

7)  Design  and  perform  human  skills  study  with  trial  environment  configurations:  display,  integrated  cues, 
stereo 

Initial human skills study complete, demonstrating basic ergonomic features and the correct
measurement environment with baseline skills. Complete study with environment
configurations not yet completed (studies designed to elicit comparative data between and
among configurations)

4.  Reportable  Outcomes 

Outcomes detailed above can be summarized as follows:

1. Experimental System Deployment

We  have  deployed  a  complete  experimental  system  and  have  conducted  initial  baseline  tasks  and  trials. 

2.  REVEAL  Tool  Suite  Improvements 

We have improved the REVEAL tool suite in order to support new devices, facilitate installation, and
support  basic  development  for  endoscopic  applications. 

3.  Stereo  Video  Display  Technology  Upgrades 

We  have  implemented  stereo  display  technology  in  the  laboratory  and  have  upgraded  display  surfaces  and 
polarization  capabilities. 

4.  Stereo  Probe  Calibration 

We  have  implemented  and  tested  stereo  calibration  algorithms  for  integrating  measurement  capabilities  into 
the  laparoscopic  environment.  These  results  can  be  used  to  solve  open  problems  such  as  through-the-lens 
(direct)  measurement  of  features. 

5.  Performance  Modeling  and  Analysis 

We  have  conducted  a  detailed  analysis  of  the  system  performance  of  the  distributed  computing  environment 
(which  supports  VIBE,  the  tile-based  display  system).  These  latency  data  reveal  specific  points  in  the 
system  where  bottlenecks  occur  and  where  improvements  can  be  made. 

6.  Enhanced  Experimental  Tools  for  Cognitive  Ergonomics  Experiments 

We  have  produced  a  set  of  cognitive  metrics  for  experimentally  assessing  features  such  as  mental  workload. 

5.  Conclusions 

We  have  established  an  extensive  software  and  hardware  platform  for  the  measurement  of  both  cognitive  and 
physical  ergonomic  factors  in  laparoscopic  surgery.  This  environment  contains  several  new  technologies  being 
developed  under  this  contract,  including  a  scalable,  flexible  tile-based  display  system,  calibration  algorithms  for 
stereo  scopes,  and  a  distributed  architecture  in  which  to  perform  computations.  The  information  technology 
development  team  has  worked  closely  with  the  clinical  team  at  the  University  of  Maryland  to  set  up  an  experimental 
space  which  we  have  used  to  conduct  initial  experiments.  The  data  from  these  experiments  are  summarized  above 
and  are  reported  in  the  literature.  We  report  that  project  milestones  were  substantially  completed  during  this 
reporting  period. 


6.  References 

Publications  referenced  in  this  report  are  listed  in  the  bibliographic  section  of  the  appendix  (appendix  C).  One 
publication  is  attached  as  the  final  appendix. 
Abstract

Developments in the architecture and organization of high performance general purpose computer
systems are largely ignored by the technology infrastructure of the modern laparoscopic
surgical  suite.  The  current  state  of  technology  for  laparoscopy  is  a  camera  and  monitor  linked 
via  a  controller  that  distributes  analog  or  digital  video  signals  without  regard  to  their  content. 
This  article  discusses  the  opportunities  that  will  be  created  by  inserting  general  purpose  high 
performance  computing  into  the  information  stream  between  camera  and  display.  Using  this 
technology  we  envision  a  radical  transformation  of  laparoscopy  from  its  current  state  as  “surgery 
by  pictures”  into  an  entirely  new,  information-rich  surgical  environment. 


1  Introduction 

Consider  the  evolution  of  sports  broadcasting  over  the  last  thirty  years.  A  football  broadcast,  circa 
1970,  consisted  of  a  low-resolution  video  image  of  the  playing  field,  voice-over  and  the  occasional 
superimposed  game  score  or  play-clock  value.  Now,  fast-forward  to  today’s  HDTV  broadcast  with 
constantly  updated  information  fields  and  key  game  information  superimposed  visually  on  the 
playing  field.  The  line  of  scrimmage  in  red,  a  line  across  the  field  indicating  the  position  of  the  first 
down  marker  in  yellow,  broadcaster-specific  advertising  images  appearing  on  virtual  banners  along 
the  sidelines. 

Non-invasive, information-rich displays of the game have enhanced the viewing experience for
home  viewers  without  requiring  any  on-field  support;  the  very  literal  “line”  of  scrimmage  that 
you  see  on  your  screen  doesn’t  require  anyone  to  walk  out  on  the  field  and  paint  the  grass.  These 
enhanced  views  are  made  possible  through  the  application  of  computer  image  processing  techniques. 

Now,  consider  the  state  of  the  art  in  displays  for  laparoscopic  surgery;  a  low  resolution  video 
image  (evolving  toward  HDTV)  is  routed  directly  from  a  camera  to  a  monitor.  There  are  occasional 
informational  displays  related  to  the  state  of  the  laparoscopic  hardware,  but  no  real  augmentation 
of  the  displayed  image  to  support  the  surgeon.  Clearly,  the  laparoscopic  surgeon  could  benefit 
from  information  enriched  displays  every  bit  as  much  as  the  sports  fan,  yet  the  state  of  the  art  in 
laparoscopic  displays  is  thirty  or  more  years  behind  sports  broadcasting. 

The two main impediments to this evolution in technology are (1) the lack of a standard,
open computing infrastructure in the data stream between camera and display; and (2) the lack of
available display area for the presentation of additional information, while preserving the present
level of video fidelity.

Figure 1: Current architecture for laparoscopic display

In  this  paper  we  present  a  general  purpose  system  architecture  to  support  the  evolution  of 
laparoscopic  surgery  away  from  “surgery  by  pictures”  toward  an  entirely  new,  information-rich 
surgical  environment.  The  architecture  will  be  based  on  low-cost,  open,  extensible  components 
that  will  provide  the  flexibility  needed  to  support  a  diverse  range  of  enhanced  views.  These  views 
will  be  based  on  information  sources  such  as  pre-operative  scans,  enhanced  intraoperative  imaging, 
and  other  on-line  information  sources. 

The  remainder  of  this  paper  is  organized  as  follows.  Section  2  proposes  a  hardware  and  software 
architecture to support an information-rich surgical environment. Section 3 will describe applications
enabled by the proposed architecture. Section 4 presents a case study based on a prototype
environment  deployed  in  the  University  of  Maryland  Medical  Center’s  simulation  environment. 
Section  5  will  conclude  the  paper  with  summary  remarks. 

2  Architecture 

Figure 1 presents an abstract view of the typical laparoscopic display system. As shown, (1) a camera
captures video images of the surgical site, (2) sends that stream of images to a controller which
may  provide  low-level  image  enhancement  features  such  as  white-balance  calibration,  brightness, 
and  color  reproduction  adjustments,  and  (3)  the  controller  distributes  the  output  video  signal  to 
one  or  more  display  devices.  This  system  has  several  virtues,  including  the  following: 

•  Simplicity  of  architecture — Vendor  supplied  systems  consist  of  a  small  number  of  components 
that  interconnect  through  simple  physical  interfaces.  The  skills  needed  to  set  up  and  support 
such  systems  are  minimal. 

•  Simplicity  of  interface — Since  the  systems  provide  only  one  function,  i.e.,  display  of  video 
from  a  laparoscope,  the  user  interface  is  limited  to  switching  the  unit  on  and  off,  and  minor 
set-and-forget  adjustments  for  brightness,  etc.  Otherwise  the  only  user  control  is  the  physical 
motion  of  the  camera  to  provide  the  desired  view. 

•  Low  latency — The  term  latency  refers  to  the  time  delay  introduced  between  the  capture  of  a 
video  frame  and  its  presentation  on  the  display  device.  Since  current  systems  are  so  simple 
in  their  design  and  operation,  the  latency  they  introduce  is  not  typically  noticed  by  the  user. 
Latency does exist, however, in the form of delays due to digital-to-analog conversion at the
camera,  image  processing  in  the  controller,  analog-to-digital  conversion  at  the  display  device, 
and  small  propagation  delays  throughout  the  chain  from  camera  to  display. 

•  Interoperability — Cameras  and  controllers  are  typically  interconnected  using  a  proprietary 
interface.  However,  once  the  controller  has  converted  the  video  into  standard  video  signals 
and  presented  the  signal  on  standard  connectors  any  standard  video  device  can  be  used  to 
capture  or  display  the  signals. 

However,  along  with  these  virtues  come  severe  limitations  on  the  ability  to  enhance  the  presented 
image: 

•  No  capacity  for  additional  information  sources — Current  systems  do  not  provide  any  way  to 
access  additional  data  sources  such  as  pre-operative  scans.  The  only  input  to  the  system  is 
the  image  stream  from  the  camera  probe. 

•  No  capacity  for  image  processing — The  only  image  processing  stage  in  current  systems  is  the 
limited  video  processing  that  takes  place  inside  the  controller.  Controllers  are  provided  as  a 
closed  system,  not  allowing  adaptation  or  enhancement. 

•  No  flexibility  in  display — Current  systems  map  the  captured  video  image  directly  to  the 
entire  frame  of  a  video  display.  This  configuration  doesn’t  leave  any  screen  area  to  be  used 
for  displaying  added  information,  and  limits  displayed  images  to  those  that  can  be  shown  on 
a  conventional  video  display.  Thus,  it  is  not  possible  to  present  enhanced  images,  such  as 
polarized  stereo. 

Since January of 2004 the Reconstruction, Enhancement, Visualization and Ergonomic Assessment
for Laparoscopy project (REVEAL) at the University of Kentucky’s Center for Visualization
and Virtual Environments has been researching image processing and display technologies to improve
the practice of laparoscopic surgery. In the course of this work we have evolved a new
system  architecture  that  attempts  to  address  the  limitations  of  existing  systems  while  retaining 
their  virtues  that  are  consistent  with  the  creation  of  an  information-rich  environment. 

The  system  architecture  we  propose  is  shown  in  Figure  2.  It  consists  of  hardware  as  well  as 
software  components  that  will  facilitate  the  enhancement  and  augmentation  of  captured  images. 
At  a  high  level,  the  key  components  are  (1)  one  or  more  cameras,  either  monocular  or  stereoscopic; 
(2)  an  image  processor  capable  of  receiving  multiple  input  sources  and  constructing  new,  synthetic 
images  that  represent  meaningful  enhancements  to  the  video  input  stream;  (3)  an  interaction  station 
where  the  surgeon  can  be  assisted  in  the  manipulation  of  images  by  a  staff  member  working  outside 
of  the  sterile  surgical  field;  and  (4)  an  extensible,  projected  display  that  will  allow  the  image  size 
and/or  pixel  density  of  displayed  images  to  be  varied  to  suit  task  requirements. 

The  subsections  that  follow  describe  the  hardware  and  software  components  of  this  system 
architecture. 

2.1  Hardware 

As shown in Figure 3, the hardware environment consists of (1) one or more cameras (monocular
or stereoscopic) providing a stream of video images of the surgery; (2) dedicated controllers for
bridging video data onto the backbone network; (3) general purpose computing nodes for image
processing, data processing, and image composition; (4) a generic, non-sterile workstation for control
of algorithms and data streams; and (5) special purpose image processing hardware for taking
processed images off the network, segmenting them and outputting the segments to casually aligned
projectors for display.

Figure 2: The REVEAL system architecture

Each of these components can be created using commercial off-the-shelf (COTS) technology or
application-specific, custom-developed hardware, depending on the performance requirements and
cost  constraints  of  the  environment  being  created.  Key  design  constraints  when  provisioning  such 
a  system  are  as  follows: 

• Availability of input streams with meta-data — Input streams, whether laparoscope video, preoperative
scan data, or other information sources, must be interfaced to the system in a way that
presents  data  to  the  system  with  predictable  real-time  performance.  Not  all  data  must  be 
received  instantaneously  after  its  creation,  but  the  system  must  have  a  reliable,  quantitative 
way  of  understanding  the  temporal  quality  of  data  in  terms  of  frame  rate,  latency,  jitter,  etc. 

•  Network  connectivity,  throughput  and  latency — Since  the  backbone  network  is  central  to  the 
hardware  system,  its  performance  will  be  critical  to  the  overall  quality  of  service  delivered  by 
the  system.  It  must  be  possible  to  connect  high-resolution  data  sources  to  computing  nodes, 
and  computing  nodes  to  image  processing  nodes  in  a  way  that  will  deliver  the  necessary 
throughput  with  minimal  latency. 

• Overall performance — Laparoscopic surgery is a closed loop; i.e., images are used by the
surgeon to decide how to move the instruments, and the instrument motions appear in the
images which are viewed by the surgeon. Any human-in-the-loop, closed-loop control system
must function with latency that is below the 200 ms level [3, 5, 8, 9]. Latency above that level
will result in degraded performance.

Throughput will also be critical; i.e., the rate at which the system is able to display image
frames. Current technology presents information at the standard rate of thirty frames per
second. Image processing must not degrade throughput to the point that a low frame rate
makes the image appear “jumpy” rather than fluid.

Figure 3: The REVEAL hardware architecture

•  Display  fidelity — Typical  COTS  projectors  are  optimized  for  applications  like  PowerPoint 
presentations.  Their  use  for  laparoscopic  image  display  can  be  problematic  if  steps  are  not 
taken  to  control  for  variations  in  color  representation,  variations  in  overall  brightness  between 
projectors,  variations  in  brightness  across  bulb  life,  variations  in  pixel  geometry  between 
projectors,  etc. 
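The temporal quantities named in the constraints above (frame rate, latency, jitter, and the 200 ms bound) can be made concrete with a short sketch. It assumes that each frame carries a capture timestamp and a display timestamp on a common clock, and it defines jitter as the standard deviation of the inter-frame display interval, which is one common convention and not necessarily the one used by the project. All names and sample values are hypothetical.

```python
from statistics import mean, pstdev

LATENCY_LIMIT_S = 0.200  # human-in-the-loop bound cited in the text

def stream_metrics(capture_times, display_times):
    """Summarise the temporal quality of a video stream.

    capture_times, display_times: per-frame timestamps in seconds,
    measured on the same clock and listed in frame order.
    """
    if len(capture_times) != len(display_times) or len(capture_times) < 2:
        raise ValueError("need matching timestamp lists with at least two frames")
    latencies = [d - c for c, d in zip(capture_times, display_times)]
    intervals = [b - a for a, b in zip(display_times, display_times[1:])]
    return {
        "mean_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        "frame_rate_hz": 1.0 / mean(intervals),
        "jitter_s": pstdev(intervals),  # spread of inter-frame intervals
    }

def within_latency_budget(metrics, limit_s=LATENCY_LIMIT_S):
    """True if even the slowest frame stayed under the latency limit."""
    return metrics["max_latency_s"] < limit_s

if __name__ == "__main__":
    capture = [0.0, 0.1, 0.2, 0.3]
    display = [0.05, 0.15, 0.25, 0.35]
    m = stream_metrics(capture, display)
    print(m, within_latency_budget(m))
```

Checking the worst-case frame rather than the mean is a deliberately conservative choice for this sketch, since occasional slow frames are what a surgeon would perceive as lag.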

2.2  Software 

As  shown  in  Figure  4,  the  key  components  of  the  software  architecture  are  (1)  a  laparoscopy 
workbench  tool  for  integrating  data  streams  and  controlling  algorithms,  and  (2)  a  distributed 
rendering  environment  for  segmenting  and  displaying  images  using  casually  aligned  projectors. 

The  interface  of  the  laparoscopy  workbench  will  be  presented  at  the  non-sterile  workstation. 
A  technician  will  aid  the  surgeon  by  selecting  data  sources,  inputting  parameters  to  control  image 
processing  algorithms,  and  manipulating  images  in  real-time  at  the  direction  of  the  surgeon.  The 
workbench  will  be  implemented  using  an  open  architecture  that  will  allow  independent  development 
of features that can be incorporated at run-time as plug-ins.

Figure 4: The REVEAL software architecture

Our distributed rendering environment uses sensor feedback to automate the calibration of
casually aligned projectors to form a single, blended display surface [4, 7, 1]. A pre-surgery calibration
phase will project a calibration pattern using each projector, and video images captured by cameras
observing the display will be used to compute necessary transformations to align image segments
and blend their intensities in real-time. During the surgery the use of multiple image processing
nodes to display the partitioned image will improve throughput and latency by distributing the
rendering workload across multiple nodes, each having a direct connection to a single projector.
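
The alignment step can be illustrated with a small sketch. One common way to relate projector pixels to camera observations of a flat screen is a planar homography estimated from point correspondences in the calibration pattern. The report does not say which transformation REVEAL computes, so the homography model, the four-corner correspondences, and the function names below are assumptions made for illustration only.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(src, dst):
    """Return a 3x3 homography (h33 fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in h11..h32.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through the homography, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Illustrative values, not measured data: corners of an 800x600 projector
# pattern and where a camera might observe them on the screen.
pattern = [(0, 0), (800, 0), (800, 600), (0, 600)]
observed = [(102.0, 95.0), (590.0, 120.0), (570.0, 470.0), (90.0, 430.0)]
H = estimate_homography(pattern, observed)
print(apply_homography(H, (400, 300)))  # camera location of the pattern centre
```

In practice a calibration pattern would supply many more than four correspondences and the fit would be a least-squares one; the four-point case is used here only because it keeps the linear system square.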

3  Applications 

The laparoscopy workbench will be a framework for creating plug-in modules that provide
functionality on an as-needed basis during the surgery. The framework will provide (1) a standard
interface to video streams, still image data, XML-encoded data streams, etc.; (2) access to standard
user interface devices such as keyboard, mouse, foot pedals, etc.; and (3) low-level application
programming interfaces to classic image processing algorithms.
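
A minimal sketch of how such a plug-in framework might be organized is given here. The class names (`Plugin`, `Workbench`), the `process` method, the two example modules, and the representation of a frame as a nested list of pixel intensities are all hypothetical; the report does not specify the workbench API.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Hypothetical base class that every workbench module would implement."""
    name = "unnamed"

    @abstractmethod
    def process(self, frame, params):
        """Return a new frame computed from `frame` and operator parameters."""


class Workbench:
    """Hypothetical host that registers plug-ins at run time and chains them."""

    def __init__(self):
        self._plugins = {}   # all known modules, keyed by name
        self._active = []    # names of modules applied to each frame, in order

    def register(self, plugin):
        self._plugins[plugin.name] = plugin

    def activate(self, name):
        if name not in self._plugins:
            raise KeyError(f"no plug-in named {name!r}")
        self._active.append(name)

    def run(self, frame, params=None):
        params = params or {}
        for name in self._active:
            frame = self._plugins[name].process(frame, params.get(name, {}))
        return frame


class Invert(Plugin):
    name = "invert"

    def process(self, frame, params):
        peak = params.get("peak", 255)
        return [[peak - px for px in row] for row in frame]


class Threshold(Plugin):
    name = "threshold"

    def process(self, frame, params):
        level = params.get("level", 128)
        return [[255 if px >= level else 0 for px in row] for row in frame]
```

A technician-facing interface would then register modules as they are loaded, activate the ones the surgeon asks for, and pass each incoming frame through `run` together with the current parameter settings.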

Within  this  framework  users  will  create  plug-in  modules  that  implement  specific  functionality. 
Modules  that  we  envision  include  the  following: 

•  Image  fusion — The  two-dimensional,  real-time  image  of  anatomy  captured  by  the  laparoscope 
lacks the three-dimensional structural information of CT, MRI or even X-ray imaging. Merging
off-line scan data with real-time video images would allow the surgeon to see into or
around anatomy to better understand structure before making critical incisions. Real-time
registration of volumetric scan data with images of the surgery as it is happening is a long-term
goal that may be unrealistic for the near term, but even off-line registration of still images
from the surgery with pre-operative scans in near real-time could provide critical information
to the surgeon intraoperatively.

•  Real-time  video  conferencing — A  larger  display  space,  only  partly  occupied  by  the  images 
from  the  laparoscope,  would  provide  a  tableau  for  adding  other  information  sources.  For 
example,  we  could  envision  a  virtual  desktop  displaying  full-resolution,  full-size  laparoscope 
images alongside images from a real-time video-teleconference consultation with colleagues,
or  interactive  display  of  medical  students  at  remote  locations  during  an  interactive  web-cast. 


•  Heads-free  stereoscopic  views — Current  COTS  technology  for  minimally  invasive  surgery  using 
stereoscopic  views  is  based  on  a  head-mounted  display  that  uses  a  pair  of  small  LCD  screens  to 
present independent views to each of the surgeon’s eyes. Wearing the head-mounted display
is both physically cumbersome and potentially disruptive to the surgeon’s interaction with
other members of the surgical team.

Using projected displays with polarizing filters to control the polarization of the light creating
images for the left and right eyes, stereoscopic views can be presented to the entire surgical
team. Rather than having to wear heavy, wired headsets, the team can simply wear lightweight
polarized glasses. When viewing the display made up of polarized light, the wearers will
perceive distinct images at their left and right eyes. When viewing objects illuminated by
non-polarized light, they will observe a normal scene with only slight attenuation of the scene’s
brightness.

• Non-invasive linear measurement — The availability of low-cost stereoscopic laparoscopes
facilitates the reconstruction of three-dimensional anatomy from stereo image pairs. These
stereo image pairs need not be displayed in stereo to the end user to be of value; the
reconstruction of three-dimensional anatomy can be used internally to the workbench to provide
quantitative dimensional information about the surgical site.

In  laparoscopic  ventral  and  incisional  hernia  repair  [6,  2],  a  small  patch  of  mesh  is  sewn  into  the 
abdominal  wall  to  repair  the  defect  that  allowed  the  herniation  to  occur.  Intraoperatively, 
the  size  of  the  defect  must  be  determined  so  that  a  mesh  patch  of  the  proper  size  can  be 
introduced  into  the  surgical  site  through  a  trocar.  The  determination  of  the  dimensions  of 
the defect is currently performed using a tape measure manipulated with graspers. This
operation is tedious and time consuming, and it requires the introduction and removal of the tape
to perform the measurement.

Using a stereoscopic laparoscope and real-time reconstruction of the three-dimensional anatomy
will allow measurements to be taken using virtual measuring tapes. The surgeon will direct
the camera at the defect to be measured, and the assisting technician will indicate the limit
points of the defect on the image of the anatomy. Image processing algorithms will then
locate the indicated points on the reconstructed anatomy and report the distance between
them. All of this will take place in the digital domain with just a few manipulations
of the controls of the laparoscopy workbench.
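
The geometry behind such a virtual measuring tape can be sketched as follows. The sketch assumes a rectified stereo pair from two ideal pinhole cameras, with the focal length given in pixels, the baseline in millimetres, and a known disparity at each indicated pixel. These assumptions, the function names, and the camera numbers in the usage note are illustrative; they are not taken from the report or from any particular endoscope.

```python
import math


def triangulate(u, v, disparity, focal_px, baseline_mm, cx, cy):
    """Back-project a left-image pixel with known disparity to (X, Y, Z) in mm."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_mm / disparity   # depth from similar triangles
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def virtual_tape(point_a, point_b, **camera):
    """Straight-line distance in mm between two (u, v, disparity) picks."""
    a = triangulate(*point_a, **camera)
    b = triangulate(*point_b, **camera)
    return math.dist(a, b)
```

For example, with the hypothetical parameters `focal_px=800`, `baseline_mm=4`, `cx=320`, `cy=240`, calling `virtual_tape((320, 240, 40), (620, 240, 40), ...)` measures two points that both lie 80 mm from the camera. Note that this is a straight-line (chord) distance; measuring along a curved abdominal wall would require integrating over the reconstructed surface.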


4  Case  Study:  Heads-free  Stereoscopic  Display 

We have created a heads-free stereoscopic display based on an early prototype of the architecture
described above. The system consists of (1) a Vista Surgical 10 mm stereoscopic endoscope; (2) a Vista
Surgical camera controller; (3) a digital video link from the camera controller to a general purpose
workstation designated the "head node" of the display cluster; (4) a high-density cluster of
computers equipped with independent display driver hardware; and (5) eight off-the-shelf SVGA video
projectors equipped with polarizing filters. Observers wear lightweight, low-cost polarized glasses
to view the projected images.

Figure 5: Heads-free Stereoscopic Display System

System throughput supports a frame rate of 15-20 frames per second. Latency is in the range
of 120-165 ms. Figure 5 presents the processing stream from capture to display. Note that a
large component of the overall system latency is introduced by the camera and
projector. This is baseline latency that exists in current camera/display configurations.
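
To put these figures in perspective, latency can be expressed in frame periods. The short calculation below is only a back-of-the-envelope aid; the helper names are hypothetical, and the report does not state which latency value corresponds to which frame rate, so only the overall range is meaningful.

```python
def frame_period_ms(fps):
    """Time between successive frames, in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_ms, fps):
    """Express an end-to-end latency as a number of frame periods."""
    return latency_ms / frame_period_ms(fps)


for fps in (15, 20):
    for latency_ms in (120, 165):
        frames = latency_in_frames(latency_ms, fps)
        print(f"{fps} fps, {latency_ms} ms latency: {frames:.2f} frame periods")
```

At 15-20 frames per second the frame period is roughly 50-67 ms, so the reported latency corresponds to somewhere between about two and three frame periods.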

In our laboratory we used a polarization-preserving back-projected screen and two sets of four
projectors to create a stereo video display measuring approximately six feet by four feet. The
use of multiple projectors to illuminate the display area ensured that image brightness was not
negatively impacted by the relatively large display area.

5  Summary 

In this article we have presented a high-level system architecture for presenting information-rich displays
during laparoscopic surgery. The system is flexible and extensible, and self-calibrates the display
area using video feedback, allowing projectors to be set up without elaborate preconfiguration.
The software architecture allows plug-and-play compatibility between software modules developed
independently of one another.

The creation of information-rich displays aims to replace some of the information lost in the
move away from open surgery toward minimally invasive procedures. We expect such displays to
have a quantifiable impact on patient time under anesthesia and on the cognitive workload imposed
on surgeons during surgery.